Smart Paper Notes

Self-adaptive algorithms for quasiconvex programming and applications to machine learning

Thang Tran Ngoc, Hai Trinh Ngoc

Category: Machine Learning

2022-12-13

For solving a broad class of nonconvex programming problems on an unbounded constraint set, we provide a self-adaptive step-size strategy that does not include line-search techniques, and we establish the convergence of a generic approach under mild assumptions. Specifically, the objective function may not satisfy the convexity condition. Unlike descent line-search algorithms, it does not need a known Lipschitz constant to determine the size of the first step. The crucial feature of this process is the steady reduction of the step size until a certain condition is fulfilled. In particular, it can provide a new gradient projection approach to optimization problems with an unbounded constraint set. The correctness of the proposed method is verified by preliminary results from some computational examples. To demonstrate the effectiveness of the proposed technique for large-scale problems, we apply it to some machine learning experiments, such as supervised feature selection, multi-variable logistic regression, and neural networks for classification.
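
The step-size idea in this abstract (no line search, no Lipschitz constant, a step that is reduced until a condition holds) can be illustrated with a small sketch. The abstract does not state the exact condition, so the sufficient-decrease test below, the parameters `sigma` and `shrink`, and the toy problem are all assumptions made for illustration; this is not the authors' algorithm.

```python
def adaptive_projected_gradient(f, grad, project, x0,
                                step=1.0, sigma=0.1, shrink=0.5, iters=100):
    """Projected gradient whose step size never grows.

    Assumed rule (not taken from the abstract): keep the step if
    f(x_new) <= f(x) - sigma * <grad(x), x - x_new>,
    otherwise multiply it by `shrink` for the following iterations.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Gradient step followed by projection onto the constraint set.
        x_new = project([xi - step * gi for xi, gi in zip(x, g)])
        predicted = sum(gi * (xi - yi) for gi, xi, yi in zip(g, x, x_new))
        if f(x_new) > f(x) - sigma * predicted:
            step *= shrink  # decrease was too small: reduce the step
        x = x_new           # the iterate moves either way (no inner loop)
    return x, step


# Toy problem (hypothetical): a quadratic over the unbounded set {x : x >= 0}.
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


def project(x):
    return [max(0.0, xi) for xi in x]


x_star, final_step = adaptive_projected_gradient(f, grad, project, [5.0, 5.0])
```

On this toy problem the constrained minimizer is (1, 0). The initial step of 1.0 is too large and is halved once, after which the iterates settle. The quadratic is convex and was chosen only so the result is easy to check by hand.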


Improving Pareto Front Learning via Multi-Sample Hypernetworks

Long Phi Hoang, Dung Duy Le, Tuan Anh Tran, Thang Tran Ngoc

Category: Machine Learning

2022-12-02

Pareto Front Learning (PFL) was recently introduced as an effective approach to obtain a mapping function from a given trade-off vector to a solution on the Pareto front, which solves the multi-objective optimization (MOO) problem. Due to the inherent trade-off between conflicting objectives, PFL offers a flexible approach in many scenarios in which the decision makers cannot specify the preference of one Pareto solution over another and must switch between them depending on the situation. However, existing PFL methods ignore the relationship between the solutions during the optimization process, which hinders the quality of the obtained front. To overcome this issue, we propose a novel PFL framework, \ourmodel, which employs a hypernetwork to generate multiple solutions from a set of diverse trade-off preferences and enhances the quality of the Pareto front by maximizing the Hypervolume indicator defined by these solutions. The experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines in producing the trade-off Pareto front.
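
To make the Hypervolume indicator concrete, here is a minimal sketch for the two-objective minimization case: it measures the area dominated by a set of solutions and bounded by a reference point. The function name, the reference point, and the sample points are illustrative assumptions; the hypernetwork and the training procedure from the abstract are not reproduced here.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are strictly better than the reference point,
    # sorted by the first objective (ties broken by the second).
    inside = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]
    for f1, f2 in inside:
        if f2 < best_f2:  # point is not dominated by the ones already swept
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


reference = (4.0, 4.0)
spread_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
clustered = [(2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]

hv_spread = hypervolume_2d(spread_front, reference)   # 6.0
hv_clustered = hypervolume_2d(clustered, reference)   # 4.0
```

The well-spread set scores higher than the clustered one, which gives some intuition for why an indicator computed over several solutions jointly can reward diversity along the front.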


Multiple Perturbation Attack: Attack Pixelwise Under Different $\ell_p$-norms For Better Adversarial Performance

Ngoc N. Tran, Anh Tuan Bui, Dinh Phung, Trung Le

Category: Computer Vision | Machine Learning

2022-12-05

Adversarial machine learning has been both a major concern and a hot topic recently, especially with the ubiquitous use of deep neural networks in the current landscape. Adversarial attacks and defenses are usually likened to a cat-and-mouse game in which defenders and attackers evolve over time. On one hand, the goal is to develop strong and robust deep networks that are resistant to malicious actors. On the other hand, in order to achieve that, we need to devise even stronger adversarial attacks to challenge these defense models. Most existing attacks employ a single $\ell_p$ distance (commonly, $p\in\{1,2,\infty\}$) to define the concept of closeness and perform steepest gradient ascent w.r.t. this $p$-norm to update all pixels in an adversarial example in the same way. Each of these $\ell_p$ attacks has its own pros and cons, and there is no single attack that can successfully break through defense models that are robust against multiple $\ell_p$ norms simultaneously. Motivated by these observations, we come up with a natural approach: combining various $\ell_p$ gradient projections on a pixel level to achieve a joint adversarial perturbation. Specifically, we learn how to perturb each pixel to maximize the attack performance, while maintaining the overall visual imperceptibility of adversarial examples. Finally, through various experiments with standardized benchmarks, we show that our method outperforms most current strong attacks across state-of-the-art defense mechanisms, while retaining its ability to remain clean visually.
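
As a rough illustration of steepest ascent under different $\ell_p$ norms, the sketch below computes the unit-norm direction that maximizes the inner product with a gradient for $p \in \{1, 2, \infty\}$, then mixes the three directions pixel by pixel. The mixing function and its fixed weights are hypothetical: the abstract says the per-pixel perturbation is learned, and that learning step is not shown here.

```python
import math


def linf_step(g):
    """Steepest ascent direction under the l-infinity norm: the sign of each entry."""
    return [float((gi > 0) - (gi < 0)) for gi in g]


def l2_step(g):
    """Steepest ascent direction under the l2 norm: the normalized gradient."""
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [gi / norm for gi in g] if norm > 0 else [0.0] * len(g)


def l1_step(g):
    """Steepest ascent direction under the l1 norm: all mass on the largest entry."""
    k = max(range(len(g)), key=lambda i: abs(g[i]))
    step = [0.0] * len(g)
    if g[k] != 0:
        step[k] = math.copysign(1.0, g[k])
    return step


def mix_per_pixel(g, weights):
    """Combine the three directions with per-pixel weights (w1, w2, winf).

    The weights are fixed, hypothetical inputs in this sketch.
    """
    d1, d2, dinf = l1_step(g), l2_step(g), linf_step(g)
    return [w1 * a + w2 * b + winf * c
            for (w1, w2, winf), a, b, c in zip(weights, d1, d2, dinf)]


gradient = [3.0, 4.0]
mixed = mix_per_pixel(gradient, [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
```

Here the first pixel follows the $\ell_1$ direction and the second follows the $\ell_\infty$ direction, so different pixels are updated in different ways within one step.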


LG-Hand: Advancing 3D Hand Pose Estimation with Locally and Globally Kinematic Knowledge

Tu Le-Xuan, Trung Tran-Quang, Thi Ngoc Hien Doan, Thanh-Hai Tran

Category: Computer Vision

2022-11-06

3D hand pose estimation from RGB images suffers from the difficulty of obtaining depth information. Therefore, a great deal of attention has been devoted to estimating 3D hand pose from 2D hand joints. In this paper, we leverage the advantages of spatial-temporal Graph Convolutional Neural Networks and propose LG-Hand, a powerful method for 3D hand pose estimation. Our method incorporates both spatial and temporal dependencies into a single process. We argue that kinematic information plays an important role, contributing to the performance of 3D hand pose estimation. We thereby introduce two new objective functions, Angle and Direction loss, to take the hand structure into account. While Angle loss covers local kinematic information, Direction loss handles global kinematic information. Our LG-Hand achieves promising results on the First-Person Hand Action Benchmark (FPHAB) dataset. We also perform an ablation study to show the efficacy of the two proposed objective functions.


The Who in Code-Switching: A Case Study for Predicting Egyptian Arabic-English Code-Switching Levels based on Character Profiles

Injy Hamed, Alia El Bolock, Cornelia Herbert, Slim Abdennadher, Ngoc Thang Vu

Category: Natural Language Processing

2022-07-31

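Code-switching (CS) is a common linguistic phenomenon exhibited by multilingual individuals, who tend to alternate between languages within a single conversation. CS is a complex phenomenon that not only poses linguistic challenges but is also highly complex in terms of its dynamic behavior across speakers. Given that the factors giving rise to CS vary from one country to another, as well as from one person to another, CS is found to be a speaker-dependent behavior in which the frequency with which the foreign language is embedded differs between speakers. While several researchers have studied CS behavior from a linguistic point of view, research is still lacking on the task of predicting user CS behavior from sociological and psychological perspectives. We provide an empirical user study in which we investigate the correlations between users' CS levels and character traits. We conduct interviews with bilinguals and gather information on their profiles, including their demographics, personality traits, and travel experiences. We then use machine learning (ML) to predict users' CS levels based on their profiles, and we identify the main influential factors in the modeling process. We experiment with both classification and regression tasks. Our results show that CS behavior is affected by the relationship between speakers, travel experiences, and the Neuroticism and Extraversion personality traits.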


PoeticTTS -- Controllable Poetry Reading for Literary Studies

Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu

Category: Natural Language Processing | Machine Learning

2022-07-11

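Speech synthesis for poetry is challenging because of the specific intonation patterns inherent to poetic speech. In this work, we propose an approach to synthesize poems with almost human-like naturalness, in order to enable literary scholars to systematically examine hypotheses about the interplay between text, spoken realization, and the listener's perception of poems. To meet these special requirements of literary studies, we resynthesize poems by cloning prosodic values from a human reference recitation, and then use fine-grained prosody control to manipulate the synthetic speech in a human-in-the-loop setting to alter the recitation w.r.t. specific phenomena. We find that fine-tuning the TTS model on poetry captures poetic intonation patterns to a large extent, which is beneficial for prosody cloning and manipulation, and we verify the success of our approach both in an objective evaluation and in human studies.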


Speaker Anonymization with Phonetic Intermediate Representations

Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu

Category: Machine Learning

2022-07-11

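In this work, we propose a speaker anonymization pipeline that leverages high-quality automatic speech recognition and synthesis systems to generate speech conditioned on phonetic transcriptions and anonymized speaker embeddings. Using phones as the intermediate representation ensures that speaker identity information is completely removed from the input while the original phonetic content is preserved as much as possible. Our experimental results on the LibriSpeech and VCTK corpora reveal two key findings: 1) although automatic speech recognition produces imperfect transcriptions, our neural speech synthesis system can handle such errors, making our system feasible and robust, and 2) combining speaker embeddings from different resources is beneficial, and their appropriate normalization is crucial. Overall, our final best system outperforms the baselines provided in the Voice Privacy Challenge 2020 by a considerable margin in terms of robustness against a lazy attacker, while maintaining high intelligibility and naturalness of the anonymized speech.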


Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech

Florian Lux, Julia Koch, Ngoc Thang Vu

Category: Natural Language Processing

2022-06-24

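Cloning a speaker's voice using an untranscribed reference sample is one of the great advances of modern neural text-to-speech (TTS) methods. Approaches for mimicking the prosody of a transcribed reference audio have also been proposed recently. In this work, we bring these two tasks together for the first time, in conjunction with an utterance-level speaker embedding. We further introduce a lightweight aligner for extracting fine-grained prosodic features, which can be fine-tuned on individual samples within seconds. We show that a speaker's voice and the prosody of a spoken reference can be cloned independently, with high similarity to both the original voice and the original prosody, as our objective evaluation and human study demonstrate. All of our code and trained models are available, along with static and interactive demos.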


Toward the smooth mesh climbing of a miniature robot using bioinspired soft and expandable claws

Hong Wang, Peng Liu, Phuoc Thanh Tran Ngoc, Bing Li, Yao Li, Hirotaka Sato

Category: Robotics

2022-06-15

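While most micro-robots have difficulty traveling on rugged terrain, beetles can walk smoothly on complex substrates without slipping or getting stuck on the surface, thanks to their variable-stiffness tarsi and the expandable hooks on the tips of the tarsi. In this study, we found that beetles actively bend and regularly expand their claws to crawl freely on mesh surfaces. Inspired by the crawling mechanism of beetles, we designed an 8-cm miniature climbing robot with artificial claws that open and bend in the same cyclic manner as those of natural beetles. The robot can climb freely with a controllable gait on mesh surfaces, on steep inclines of 60°, and even on transition surfaces. To the best of our knowledge, this is the first miniature robot that can climb both mesh surfaces and cliff-like inclines.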


Binarizing Split Learning for Data Privacy Enhancement and Computation Reduction

Ngoc Duy Pham, Alsharif Abuadbba, Yansong Gao, Tran Khoa Phan, Naveen Chilamkurti

Category: Machine Learning

2022-06-10

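Split learning (SL) enables data privacy preservation by allowing clients to collaboratively train a deep learning model without sharing raw data. However, SL still has limitations, such as potential data privacy leakage and high computation at the client. In this study, we propose to binarize the SL local layers for faster computation (17.5 times less forward-propagation time in both the training and inference phases on mobile devices) and reduced memory usage (up to 32 times less memory and bandwidth requirements). More importantly, the binarized SL (B-SL) model can reduce privacy leakage from SL smashed data with only a small degradation in model accuracy. To further enhance privacy preservation, we also propose two novel approaches: 1) training with an additional local leak loss and 2) applying differential privacy, which can be integrated into the B-SL model separately or concurrently. Experimental results on different datasets confirm the advantages of the B-SL models compared with several benchmark models. The effectiveness of B-SL models against the feature-space hijacking attack (FSHA) is also illustrated. Our results show that B-SL models are promising for lightweight IoT/mobile applications with high privacy-preservation requirements, such as mobile healthcare applications.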
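
The "up to 32 times" memory and bandwidth figure is consistent with storing one bit per value instead of a 32-bit float. The sketch below shows that arithmetic with sign binarization and bit packing. It assumes a float32 baseline and a plain sign function, neither of which is stated in the abstract, and it does not reproduce the paper's binarized layers or training procedure.

```python
def binarize(values):
    """Map each activation to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per value (+1 -> 1, -1 -> 0)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s > 0:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, n):
    """Recover the first n +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(n)]


# 32 hypothetical activations that a client would send to the server.
activations = [0.3, -1.2, 0.0, 2.5, -0.1, -0.7, 1.1, 0.9] * 4
signs = binarize(activations)
payload = pack_bits(signs)

float32_bytes = 4 * len(activations)    # 128 bytes if sent as float32
ratio = float32_bytes // len(payload)   # 128 // 4 = 32
```

The packed payload round-trips back to the same signs, so the saving in this sketch comes purely from the narrower representation.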
